## What is claimed is:

- 1. An apparatus comprising:
  - a first processing unit including
    - a first register file operably coupled to one or more first vector arithmetic logic units;
  - a second processing unit including
    - a second register file operably coupled to one or more second vector arithmetic logic units;

wherein the first register file has a plurality of cross-connections to at least one of the second vector arithmetic logic units; and

wherein the second register file has a plurality of cross-connections to at least one of the first vector arithmetic logic units.

- 2. The apparatus of claim 1 wherein the plurality of cross connections between the first register file and the one or more second vector arithmetic logic units and the second register file and the one or more first vector arithmetic logic units permits the exchange of a plurality of operands within a single cycle.
- 3. The apparatus of claim 1 wherein at least one of the first and second vector arithmetic logic units utilizes dynamically scalable single instruction multiple data processing.
- 4. The apparatus of claim 1, wherein two or more source vectors of a predetermined length are input into the at least one of the first and second vector arithmetic logic units and two or more destination vectors of the predetermined length are output from the at least one or more of the first and second vector arithmetic logic units.

- 5. The apparatus of claim 1, wherein a vector arithmetic logic unit comprises
  - a first arithmetic logic component and
  - a second arithmetic logic component,

wherein an output of the first arithmetic logic component is coupled with an input of the second arithmetic logic component.

- 6. The apparatus of claim 1, wherein the first vector arithmetic logic unit has access to a plurality of source vectors from the second register file via the plurality of cross connections and the second vector arithmetic logic unit has access to a plurality of source vectors from the first register file via the plurality of cross connections.
- 7. The apparatus of claim 1 wherein the cross-connections between the first vector arithmetic logic unit and the second vector arithmetic logic unit enable a cross multiply operation between at least one operand in the first register file and at least one operand in the second register file.
- 8. An apparatus comprising:
  - at least one vector arithmetic logic unit; and
  - a control logic unit that enables conditional control of one or more vector operations on one or more elements of a plurality of elements of a vector by the at least one vector arithmetic logic unit.
- 9. The apparatus of claim 8 wherein the conditional control of one or more vector operations on one or more elements of the plurality of elements of a vector comprises dynamically scalable performance of a plurality of arithmetic or logical operations on one or more data elements of a vector in a single cycle.

- 10. An apparatus comprising:
  - a first single instruction multiple data vector processing unit comprising:
    - a first register file,
    - a first vector arithmetic logic unit operably coupled to the first register file, and
    - a first vector network unit operably coupled to the first register file; and
  - a second single instruction multiple data vector processing unit comprising:
    - a second register file,
    - a second vector arithmetic logic unit operably coupled to the second register file,
    - a second vector network unit operably coupled to the second register file, and
    - a plurality of cross connections between the first register file and the second vector arithmetic logic unit and the second register file and the first vector arithmetic logic unit.
- 11. The apparatus of claim 10 wherein the first vector network unit comprises a vector permute unit and a vector logical operations unit.
- 12. The apparatus of claim 10 wherein the vector permute unit can select any one or more 8 bit portion of at least one input vector and place them into any one or more 8 bit portion of at least one output vector.
- 13. The apparatus of claim 11 wherein the vector permute unit comprises a function decoder for the byte-wise control of the operation performed by a crossbar switch based on a specified address register value or an immediate data value in an instruction code.

- 14. The apparatus of claim 11 wherein the vector logical operations unit is capable of performing in a single cycle a conditional vector field selection of the fields in two source vectors based on the respective selection flag values in a third source vector, which appends the respective selection flags to the least significant portion of each respective selected field and shifts the result left by one bit position.

- 15. An apparatus comprising:
  - a vector arithmetic logic unit comprising
    - a first vector arithmetic unit, capable of performing at least vector addition and subtraction operations operably coupled to
    - a second vector arithmetic unit, capable of performing at least vector subtraction operations, and

wherein the output of the second vector arithmetic unit is conditionally operably coupled to the input of the first vector arithmetic unit to allow single cycle compound operations.

- 16. A method of vector processing comprising:
  - providing a first vector processing unit including a first vector arithmetic logic unit and a first vector register file;
  - providing a second vector processing unit including a second vector arithmetic logic unit and a second vector register file; and
  - providing a plurality of cross connections between the first vector arithmetic logic unit and the second vector arithmetic logic unit so as to enable a cross multiply operation between at least one operand in the first vector register file and at least one operand in the second vector register file in a single cycle.

- 17. A method of vector processing comprising:
  - providing data to a first vector processing unit including
    - a first register file and
    - a first vector arithmetic logic unit;
  - providing data to a second vector processing unit including
    - a second register file and
    - a second vector arithmetic logic unit;

wherein the first register file has a plurality of cross connections to the second vector arithmetic logic unit; and

wherein the second register file has a plurality of cross connections to the first vector arithmetic logic unit.